library(astsa)
library(ggplot2)
Plotting the GDP data.
tsplot(gdp)
Plotting the acf of the data.
acf1(gdp)
## [1] 0.99 0.98 0.97 0.96 0.95 0.94 0.93 0.92 0.91 0.90 0.89 0.88 0.87 0.86 0.85
## [16] 0.84 0.83 0.82 0.81 0.80 0.79 0.78 0.77 0.76 0.75 0.74 0.73
Applying transformation over data
tsplot(diff(log(gdp)), ylab="GDP tranformed data", col=4)
abline(h=mean(diff(log(gdp))), col=6)
Considering the difference on the logged data for now.
plotting the acf and pacf plot to determine the p and q values for the ARIMA model.
acf2(diff(log(gdp)))
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## ACF 0.36 0.23 0.02 -0.06 -0.13 -0.03 -0.04 -0.01 0.08 0.10 0.02 -0.11 -0.11
## PACF 0.36 0.11 -0.11 -0.07 -0.08 0.07 -0.02 -0.02 0.10 0.04 -0.07 -0.16 -0.02
## [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25]
## ACF -0.05 -0.08 0.05 0.05 0.11 0.07 0.07 -0.07 -0.05 -0.09 -0.02 0.03
## PACF 0.08 -0.06 0.08 0.01 0.06 -0.02 -0.01 -0.07 0.03 -0.04 0.03 0.05
## [,26] [,27]
## ACF 0.03 0.06
## PACF -0.04 0.03
Inspecting the sample ACF and PACF it feels like there are 3 possibilities.
1) AR(1) model as PACF looks like cutting off after lag 1 and ACF tails off
2) MA(2) model as ACF looks like cutting off after lag 2 and PACF tails off
3) ARMA(1,0,2) on diff(log(gdp)) or ARIMA(1,1,2) on log(gdp).
Trying to fit the model over the data
AR(1)
sarima(log(gdp),p=1,d=1,q=0)
## initial value -4.673186
## iter 2 value -4.742918
## iter 3 value -4.742921
## iter 4 value -4.742923
## iter 5 value -4.742925
## iter 6 value -4.742925
## iter 6 value -4.742925
## final value -4.742925
## converged
## initial value -4.742227
## iter 2 value -4.742232
## iter 3 value -4.742244
## iter 3 value -4.742244
## iter 3 value -4.742244
## final value -4.742244
## converged
## $fit
##
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D,
## Q), period = S), xreg = constant, transform.pars = trans, fixed = fixed,
## optim.control = list(trace = trc, REPORT = 1, reltol = tol))
##
## Coefficients:
## ar1 constant
## 0.3603 0.0077
## s.e. 0.0551 0.0008
##
## sigma^2 estimated as 7.599e-05: log likelihood = 950.47, aic = -1894.93
##
## $degrees_of_freedom
## [1] 284
##
## $ttable
## Estimate SE t.value p.value
## ar1 0.3603 0.0551 6.5365 0
## constant 0.0077 0.0008 9.5915 0
##
## $AIC
## [1] -6.625631
##
## $AICc
## [1] -6.625483
##
## $BIC
## [1] -6.587281
MA(2)
sarima(log(gdp),p=0,d=1,q=2)
## initial value -4.672758
## iter 2 value -4.749239
## iter 3 value -4.750696
## iter 4 value -4.750723
## iter 5 value -4.750724
## iter 6 value -4.750725
## iter 7 value -4.750725
## iter 7 value -4.750725
## iter 7 value -4.750725
## final value -4.750725
## converged
## initial value -4.751076
## iter 2 value -4.751078
## iter 3 value -4.751079
## iter 4 value -4.751079
## iter 5 value -4.751079
## iter 5 value -4.751079
## iter 5 value -4.751079
## final value -4.751079
## converged
## $fit
##
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D,
## Q), period = S), xreg = constant, transform.pars = trans, fixed = fixed,
## optim.control = list(trace = trc, REPORT = 1, reltol = tol))
##
## Coefficients:
## ma1 ma2 constant
## 0.3070 0.2258 0.0077
## s.e. 0.0579 0.0547 0.0008
##
## sigma^2 estimated as 7.465e-05: log likelihood = 952.99, aic = -1897.98
##
## $degrees_of_freedom
## [1] 283
##
## $ttable
## Estimate SE t.value p.value
## ma1 0.3070 0.0579 5.2987 0
## ma2 0.2258 0.0547 4.1270 0
## constant 0.0077 0.0008 9.8631 0
##
## $AIC
## [1] -6.636309
##
## $AICc
## [1] -6.636011
##
## $BIC
## [1] -6.585176
ARIMA(1,1,2)
sarima(log(gdp),p=1,d=1,q=2)
## initial value -4.673186
## iter 2 value -4.678069
## iter 3 value -4.753134
## iter 4 value -4.754349
## iter 5 value -4.754580
## iter 6 value -4.754748
## iter 7 value -4.755131
## iter 8 value -4.755320
## iter 9 value -4.755405
## iter 10 value -4.755408
## iter 10 value -4.755408
## final value -4.755408
## converged
## initial value -4.754185
## iter 2 value -4.754202
## iter 3 value -4.754216
## iter 4 value -4.754237
## iter 5 value -4.754250
## iter 6 value -4.754253
## iter 7 value -4.754255
## iter 8 value -4.754257
## iter 9 value -4.754259
## iter 10 value -4.754259
## iter 10 value -4.754259
## final value -4.754259
## converged
## $fit
##
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D,
## Q), period = S), xreg = constant, transform.pars = trans, fixed = fixed,
## optim.control = list(trace = trc, REPORT = 1, reltol = tol))
##
## Coefficients:
## ar1 ma1 ma2 constant
## 0.2668 0.0563 0.1833 0.0077
## s.e. 0.1698 0.1664 0.0735 0.0009
##
## sigma^2 estimated as 7.417e-05: log likelihood = 953.9, aic = -1897.8
##
## $degrees_of_freedom
## [1] 282
##
## $ttable
## Estimate SE t.value p.value
## ar1 0.2668 0.1698 1.5711 0.1173
## ma1 0.0563 0.1664 0.3384 0.7353
## ma2 0.1833 0.0735 2.4951 0.0132
## constant 0.0077 0.0009 8.9732 0.0000
##
## $AIC
## [1] -6.635676
##
## $AICc
## [1] -6.635178
##
## $BIC
## [1] -6.57176
In this ARIMA(1,1,2) has insignificant values as shown by p.value in model fit.
Standardized residuals in all the models shows no pattern.
ACF of the residuals shows no apparent departure from the model assumption apart from AR(1).
P-Value plot shows values exceeding 0.05, we can feel comfortable not rejecting the null hypothesis that the residuals are white apart from AR(1).
Let’s try higher dimension model to see if the value changes
sarima(log(gdp),p=2,d=1,q=2)
## initial value -4.673371
## iter 2 value -4.729951
## iter 3 value -4.753983
## iter 4 value -4.754242
## iter 5 value -4.754684
## iter 6 value -4.754891
## iter 7 value -4.754974
## iter 8 value -4.755021
## iter 9 value -4.755244
## iter 10 value -4.755828
## iter 11 value -4.756123
## iter 12 value -4.756607
## iter 13 value -4.757501
## iter 14 value -4.757843
## iter 15 value -4.757983
## iter 16 value -4.758260
## iter 17 value -4.758444
## iter 18 value -4.758502
## iter 19 value -4.758537
## iter 20 value -4.758615
## iter 21 value -4.758707
## iter 22 value -4.758757
## iter 23 value -4.758767
## iter 24 value -4.758768
## iter 25 value -4.758772
## iter 26 value -4.758773
## iter 27 value -4.758774
## iter 27 value -4.758774
## iter 27 value -4.758774
## final value -4.758774
## converged
## initial value -4.757277
## iter 2 value -4.757298
## iter 3 value -4.757333
## iter 4 value -4.757363
## iter 5 value -4.757387
## iter 6 value -4.757396
## iter 7 value -4.757398
## iter 8 value -4.757403
## iter 9 value -4.757407
## iter 10 value -4.757409
## iter 11 value -4.757412
## iter 12 value -4.757417
## iter 13 value -4.757417
## iter 14 value -4.757421
## iter 15 value -4.757423
## iter 16 value -4.757425
## iter 17 value -4.757426
## iter 18 value -4.757427
## iter 19 value -4.757428
## iter 20 value -4.757432
## iter 21 value -4.757437
## iter 22 value -4.757446
## iter 23 value -4.757499
## iter 24 value -4.757506
## iter 25 value -4.757527
## iter 26 value -4.757533
## iter 27 value -4.757535
## iter 28 value -4.757538
## iter 29 value -4.757566
## iter 30 value -4.757674
## iter 31 value -4.757720
## iter 32 value -4.757874
## iter 33 value -4.758110
## iter 34 value -4.758329
## iter 35 value -4.758837
## iter 36 value -4.759047
## iter 37 value -4.759392
## iter 38 value -4.759404
## iter 39 value -4.759407
## iter 40 value -4.759412
## iter 41 value -4.759415
## iter 42 value -4.759418
## iter 43 value -4.759421
## iter 44 value -4.759425
## iter 45 value -4.759429
## iter 46 value -4.759429
## iter 47 value -4.759430
## iter 48 value -4.759430
## iter 49 value -4.759430
## iter 49 value -4.759430
## iter 49 value -4.759430
## final value -4.759430
## converged
## $fit
##
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D,
## Q), period = S), xreg = constant, transform.pars = trans, fixed = fixed,
## optim.control = list(trace = trc, REPORT = 1, reltol = tol))
##
## Coefficients:
## ar1 ar2 ma1 ma2 constant
## 1.2423 -0.6409 -0.930 0.4857 0.0077
## s.e. 0.1869 0.1645 0.217 0.1589 0.0007
##
## sigma^2 estimated as 7.338e-05: log likelihood = 955.38, aic = -1898.76
##
## $degrees_of_freedom
## [1] 281
##
## $ttable
## Estimate SE t.value p.value
## ar1 1.2423 0.1869 6.6474 0.0000
## ar2 -0.6409 0.1645 -3.8959 0.0001
## ma1 -0.9300 0.2170 -4.2857 0.0000
## ma2 0.4857 0.1589 3.0566 0.0025
## constant 0.0077 0.0007 10.8968 0.0000
##
## $AIC
## [1] -6.639025
##
## $AICc
## [1] -6.638276
##
## $BIC
## [1] -6.562326
sarima(log(gdp),p=1,d=1,q=3)
## initial value -4.673186
## iter 2 value -4.679737
## iter 3 value -4.754487
## iter 4 value -4.755487
## iter 5 value -4.755626
## iter 6 value -4.755630
## iter 7 value -4.755632
## iter 8 value -4.755643
## iter 9 value -4.755646
## iter 10 value -4.755649
## iter 11 value -4.755651
## iter 12 value -4.755653
## iter 13 value -4.755653
## iter 14 value -4.755654
## iter 15 value -4.755654
## iter 16 value -4.755654
## iter 16 value -4.755654
## iter 16 value -4.755654
## final value -4.755654
## converged
## initial value -4.754738
## iter 2 value -4.754756
## iter 3 value -4.754775
## iter 4 value -4.754819
## iter 5 value -4.754880
## iter 6 value -4.754928
## iter 7 value -4.754972
## iter 8 value -4.755021
## iter 9 value -4.755081
## iter 10 value -4.755125
## iter 11 value -4.755141
## iter 12 value -4.755143
## iter 12 value -4.755143
## final value -4.755143
## converged
## $fit
##
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D,
## Q), period = S), xreg = constant, transform.pars = trans, fixed = fixed,
## optim.control = list(trace = trc, REPORT = 1, reltol = tol))
##
## Coefficients:
## ar1 ma1 ma2 ma3 constant
## 0.0350 0.2896 0.2649 0.0841 0.0077
## s.e. 0.3515 0.3466 0.1251 0.1058 0.0009
##
## sigma^2 estimated as 7.403e-05: log likelihood = 954.15, aic = -1896.31
##
## $degrees_of_freedom
## [1] 281
##
## $ttable
## Estimate SE t.value p.value
## ar1 0.0350 0.3515 0.0995 0.9208
## ma1 0.2896 0.3466 0.8356 0.4041
## ma2 0.2649 0.1251 2.1179 0.0351
## ma3 0.0841 0.1058 0.7955 0.4270
## constant 0.0077 0.0009 8.9463 0.0000
##
## $AIC
## [1] -6.63045
##
## $AICc
## [1] -6.629701
##
## $BIC
## [1] -6.553751
Values doesn’t change much for higher models.
Looking at the AIC, AICc and BIC MA(2) is the better choice model of the 3.
First lets see the time series
tsplot(gtemp_land)
We can see a clear trend, let’s look at the ACF to confirm this,
acf(gtemp_land)
The slow decay of the ACF suggest us to take a difference, we do that now.
tsplot(diff(gtemp_land))
abline(h=mean(diff(gtemp_land)))
Looks much better.
The next step is to plot the ACF and PACF of this differenced gtemp, and establish orders for the ARIMA model.
par(mfrow = c(2,1))
acf(diff(gtemp_land))
pacf(diff(gtemp_land))
From the above we can see that the ACF appears to cut off at lag 1 and the PACF tails off. This is suggesting the model MA(1) for the difference temp, or ARIMA(0, 1,1) for the gtemp_land. Let use this and fit the model.
par(mfrow = c(1,1))
sarima(gtemp_land, 0, 1, 1)
## initial value -1.569222
## iter 2 value -1.705059
## iter 3 value -1.725361
## iter 4 value -1.739257
## iter 5 value -1.744454
## iter 6 value -1.746115
## iter 7 value -1.747097
## iter 8 value -1.747305
## iter 9 value -1.747868
## iter 10 value -1.747882
## iter 11 value -1.747883
## iter 11 value -1.747883
## final value -1.747883
## converged
## initial value -1.745205
## iter 2 value -1.745226
## iter 3 value -1.745228
## iter 4 value -1.745237
## iter 5 value -1.745239
## iter 5 value -1.745239
## iter 5 value -1.745239
## final value -1.745239
## converged
## $fit
##
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D,
## Q), period = S), xreg = constant, transform.pars = trans, fixed = fixed,
## optim.control = list(trace = trc, REPORT = 1, reltol = tol))
##
## Coefficients:
## ma1 constant
## -0.7156 0.0136
## s.e. 0.0623 0.0043
##
## sigma^2 estimated as 0.03033: log likelihood = 44.7, aic = -83.41
##
## $degrees_of_freedom
## [1] 135
##
## $ttable
## Estimate SE t.value p.value
## ma1 -0.7156 0.0623 -11.4862 0.0000
## constant 0.0136 0.0043 3.1608 0.0019
##
## $AIC
## [1] -0.6088048
##
## $AICc
## [1] -0.6081512
##
## $BIC
## [1] -0.5448636
Diagnoses
Standardized Residuals: They have to apparent pattern, and so we can conclude tha they resemble white noise.
ACF of Residuals: Much like the Residuals Vs Time plot the ACF of these appears to resemble the acf of white noise and so we are happy.
QQ Plot: The qqplot is a great fit, even towards the tails, the divergence from the theoretical quantiles is minimal.
Trying more complex models
sarima(gtemp_land, 0, 1, 2)
## initial value -1.569222
## iter 2 value -1.701742
## iter 3 value -1.745835
## iter 4 value -1.750214
## iter 5 value -1.750531
## iter 6 value -1.750831
## iter 7 value -1.750962
## iter 8 value -1.750969
## iter 9 value -1.750970
## iter 10 value -1.750970
## iter 11 value -1.750970
## iter 12 value -1.750970
## iter 13 value -1.750970
## iter 13 value -1.750970
## iter 13 value -1.750970
## final value -1.750970
## converged
## initial value -1.748320
## iter 2 value -1.748343
## iter 3 value -1.748344
## iter 4 value -1.748346
## iter 5 value -1.748347
## iter 5 value -1.748347
## iter 5 value -1.748347
## final value -1.748347
## converged
## $fit
##
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D,
## Q), period = S), xreg = constant, transform.pars = trans, fixed = fixed,
## optim.control = list(trace = trc, REPORT = 1, reltol = tol))
##
## Coefficients:
## ma1 ma2 constant
## -0.6593 -0.0715 0.0136
## s.e. 0.0823 0.0765 0.0041
##
## sigma^2 estimated as 0.03014: log likelihood = 45.13, aic = -82.26
##
## $degrees_of_freedom
## [1] 134
##
## $ttable
## Estimate SE t.value p.value
## ma1 -0.6593 0.0823 -8.0160 0.0000
## ma2 -0.0715 0.0765 -0.9352 0.3514
## constant 0.0136 0.0041 3.3223 0.0012
##
## $AIC
## [1] -0.6004225
##
## $AICc
## [1] -0.5991053
##
## $BIC
## [1] -0.5151675
sarima(gtemp_land, 1, 1, 1)
## initial value -1.567602
## iter 2 value -1.676620
## iter 3 value -1.708252
## iter 4 value -1.723086
## iter 5 value -1.733740
## iter 6 value -1.740424
## iter 7 value -1.740507
## iter 8 value -1.740637
## iter 9 value -1.740637
## iter 10 value -1.740639
## iter 11 value -1.740639
## iter 12 value -1.740639
## iter 12 value -1.740639
## iter 12 value -1.740639
## final value -1.740639
## converged
## initial value -1.747901
## iter 2 value -1.747990
## iter 3 value -1.748164
## iter 4 value -1.748453
## iter 5 value -1.748473
## iter 6 value -1.748473
## iter 6 value -1.748473
## iter 6 value -1.748473
## final value -1.748473
## converged
## $fit
##
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D,
## Q), period = S), xreg = constant, transform.pars = trans, fixed = fixed,
## optim.control = list(trace = trc, REPORT = 1, reltol = tol))
##
## Coefficients:
## ar1 ma1 constant
## 0.1045 -0.7602 0.0136
## s.e. 0.1104 0.0672 0.0041
##
## sigma^2 estimated as 0.03013: log likelihood = 45.15, aic = -82.29
##
## $degrees_of_freedom
## [1] 134
##
## $ttable
## Estimate SE t.value p.value
## ar1 0.1045 0.1104 0.9464 0.3456
## ma1 -0.7602 0.0672 -11.3153 0.0000
## constant 0.0136 0.0041 3.3363 0.0011
##
## $AIC
## [1] -0.6006753
##
## $AICc
## [1] -0.5993581
##
## $BIC
## [1] -0.5154204
Seeing the AIC and AICc, the model MA(1) or ARIMA(0,1,1) is best.
Forecast
sarima.for(gtemp, 10, 0, 1, 1)
## $pred
## Time Series:
## Start = 2010
## End = 2019
## Frequency = 1
## [1] 0.5467731 0.5531159 0.5594588 0.5658016 0.5721445 0.5784874 0.5848302
## [8] 0.5911731 0.5975159 0.6038588
##
## $se
## Time Series:
## Start = 2010
## End = 2019
## Frequency = 1
## [1] 0.09746909 0.10310668 0.10845160 0.11354521 0.11841992 0.12310175
## [7] 0.12761193 0.13196806 0.13618492 0.14027508
Plotting the data
tsplot(birth)
Trying transformations over the data
tsplot(diff(diff(birth,12)))
abline(h = mean(diff(diff(birth,12))))
Plotting the ACF and PACF of the data
acf2(diff(diff(birth,12)))
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## ACF -0.3 -0.09 -0.09 0.00 0.07 0.03 -0.07 -0.04 0.11 0.04 0.13 -0.43 0.14
## PACF -0.3 -0.20 -0.21 -0.14 -0.03 0.02 -0.06 -0.08 0.06 0.08 0.23 -0.32 -0.06
## [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25]
## ACF -0.01 0.03 0.01 0.02 0.00 0.03 -0.07 -0.01 0 0.06 -0.01 -0.12
## PACF -0.13 -0.13 -0.11 0.02 0.06 0.04 -0.10 0.02 0 0.17 -0.13 -0.14
## [,26] [,27] [,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35] [,36] [,37]
## ACF 0.17 -0.04 0.03 -0.05 -0.09 -0.01 0.19 -0.03 -0.09 -0.02 -0.04 0.17
## PACF 0.07 -0.04 -0.02 0.02 -0.06 -0.07 0.05 0.07 -0.06 0.05 -0.16 -0.01
## [,38] [,39] [,40] [,41] [,42] [,43] [,44] [,45] [,46] [,47] [,48]
## ACF -0.14 0.03 -0.05 0.03 0.10 0 -0.10 -0.03 0.06 0.02 0.01
## PACF -0.04 -0.01 -0.03 -0.01 0.01 0 0.03 -0.02 -0.07 0.05 -0.11
Seasons - It appears that at the seasons, the ACF is cutting off at lag 1s (s=12) whereas the PACF is tailing off at lags 1s,2s,3s,4s. These results simply imply an SMA(1)
Non Seasonal - Inpecting the sample ACF and PACF at the first few lags, it appears as though the ACF cuts off at lag = 1, whereas the PACF is tailing off. This suggests an MA(1) within the seasons, p=0 and q=1.
sarima(birth,p=0,d=1,q=1,P=0,D=1,Q=1,S=12,no.constant = TRUE)
## initial value 2.219164
## iter 2 value 2.013310
## iter 3 value 1.988107
## iter 4 value 1.980026
## iter 5 value 1.967594
## iter 6 value 1.965384
## iter 7 value 1.965049
## iter 8 value 1.964993
## iter 9 value 1.964992
## iter 9 value 1.964992
## iter 9 value 1.964992
## final value 1.964992
## converged
## initial value 1.951264
## iter 2 value 1.945867
## iter 3 value 1.945729
## iter 4 value 1.945723
## iter 5 value 1.945723
## iter 5 value 1.945723
## iter 5 value 1.945723
## final value 1.945723
## converged
## $fit
##
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D,
## Q), period = S), include.mean = !no.constant, transform.pars = trans, fixed = fixed,
## optim.control = list(trace = trc, REPORT = 1, reltol = tol))
##
## Coefficients:
## ma1 sma1
## -0.4734 -0.7861
## s.e. 0.0598 0.0451
##
## sigma^2 estimated as 47.4: log likelihood = -1211.28, aic = 2428.56
##
## $degrees_of_freedom
## [1] 358
##
## $ttable
## Estimate SE t.value p.value
## ma1 -0.4734 0.0598 -7.9097 0
## sma1 -0.7861 0.0451 -17.4227 0
##
## $AIC
## [1] 6.545975
##
## $AICc
## [1] 6.546062
##
## $BIC
## [1] 6.577399
p-values of the residuals are all below 0.05 and thus we cannot say it to be white.
ARIMA(0,1,1) x (0,1,1)_12 does’nt give good results.
let’s try for
ARIMA(0,1,2) x (0,1,1)_12
sarima(birth,p=0,d=1,q=2,P=0,D=1,Q=1,S=12,no.constant = TRUE)
## initial value 2.219164
## iter 2 value 1.999134
## iter 3 value 1.967518
## iter 4 value 1.960356
## iter 5 value 1.951363
## iter 6 value 1.949760
## iter 7 value 1.949597
## iter 8 value 1.949590
## iter 8 value 1.949590
## final value 1.949590
## converged
## initial value 1.936206
## iter 2 value 1.930945
## iter 3 value 1.930890
## iter 4 value 1.930887
## iter 5 value 1.930886
## iter 6 value 1.930886
## iter 6 value 1.930886
## iter 6 value 1.930886
## final value 1.930886
## converged
## $fit
##
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D,
## Q), period = S), include.mean = !no.constant, transform.pars = trans, fixed = fixed,
## optim.control = list(trace = trc, REPORT = 1, reltol = tol))
##
## Coefficients:
## ma1 ma2 sma1
## -0.3977 -0.1604 -0.7963
## s.e. 0.0514 0.0484 0.0443
##
## sigma^2 estimated as 45.94: log likelihood = -1205.94, aic = 2419.87
##
## $degrees_of_freedom
## [1] 357
##
## $ttable
## Estimate SE t.value p.value
## ma1 -0.3977 0.0514 -7.7433 0.000
## ma2 -0.1604 0.0484 -3.3112 0.001
## sma1 -0.7963 0.0443 -17.9808 0.000
##
## $AIC
## [1] 6.52257
##
## $AICc
## [1] 6.522747
##
## $BIC
## [1] 6.564469
sarima(birth,p=1,d=1,q=1,P=0,D=1,Q=1,S=12,no.constant = TRUE)
## initial value 2.218186
## iter 2 value 2.032584
## iter 3 value 1.982464
## iter 4 value 1.975643
## iter 5 value 1.971721
## iter 6 value 1.967284
## iter 7 value 1.963840
## iter 8 value 1.961106
## iter 9 value 1.960849
## iter 10 value 1.960692
## iter 11 value 1.960683
## iter 12 value 1.960675
## iter 13 value 1.960672
## iter 13 value 1.960672
## iter 13 value 1.960672
## final value 1.960672
## converged
## initial value 1.940459
## iter 2 value 1.934425
## iter 3 value 1.932752
## iter 4 value 1.931750
## iter 5 value 1.931074
## iter 6 value 1.930882
## iter 7 value 1.930860
## iter 8 value 1.930859
## iter 8 value 1.930859
## final value 1.930859
## converged
## $fit
##
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D,
## Q), period = S), include.mean = !no.constant, transform.pars = trans, fixed = fixed,
## optim.control = list(trace = trc, REPORT = 1, reltol = tol))
##
## Coefficients:
## ar1 ma1 sma1
## 0.3038 -0.7006 -0.8000
## s.e. 0.0865 0.0604 0.0441
##
## sigma^2 estimated as 45.91: log likelihood = -1205.93, aic = 2419.85
##
## $degrees_of_freedom
## [1] 357
##
## $ttable
## Estimate SE t.value p.value
## ar1 0.3038 0.0865 3.5104 5e-04
## ma1 -0.7006 0.0604 -11.5984 0e+00
## sma1 -0.8000 0.0441 -18.1302 0e+00
##
## $AIC
## [1] 6.522519
##
## $AICc
## [1] 6.522695
##
## $BIC
## [1] 6.564418
P-values of both the models are decent enough to consider it white but the AIC and AICc values indicate
ARIMA(1,1,1) x (0,1,1)_12 as better model.
sarima.for(birth,12,1,1,1,0,1,1,12,no.constant = TRUE)
## $pred
## Jan Feb Mar Apr May Jun Jul Aug
## 1979 258.3256 281.2730 263.3864 274.2558 270.5740 294.4331 301.2872
## 1980 275.5399
## Sep Oct Nov Dec
## 1979 295.5201 289.1691 272.2940 283.4030
## 1980
##
## $se
## Jan Feb Mar Apr May Jun Jul
## 1979 6.776034 7.913388 8.562644 9.080446 9.546734 9.984553
## 1980 12.274923
## Aug Sep Oct Nov Dec
## 1979 10.402012 10.802780 11.189036 11.562347 11.923961
## 1980
Trying to plot the CPG
tsplot(cpg)
The trend of the values are decreasing continously and almost tending to zero when seen for such a long duration.
logcpg = log(cpg)
modlm = lm(logcpg ~ time(cpg),na.action = NULL)
summary(modlm)
##
## Call:
## lm(formula = logcpg ~ time(cpg), na.action = NULL)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.77156 -0.39840 0.04726 0.42186 1.13129
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1172.49431 27.57793 42.52 <2e-16 ***
## time(cpg) -0.58508 0.01383 -42.30 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6231 on 27 degrees of freedom
## Multiple R-squared: 0.9851, Adjusted R-squared: 0.9846
## F-statistic: 1790 on 1 and 27 DF, p-value: < 2.2e-16
tsplot(modlm$fitted.values)
lines(logcpg)
The fitted line is nearly perfect to the logged data. Let \(log(c_t) = \beta t\) then
\(c_t = e^{\beta t}\)
plot(resid(modlm))
acf2(resid(modlm))
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## ACF 0.61 0.47 0.32 0.15 -0.01 -0.10 -0.23
## PACF 0.61 0.16 -0.03 -0.11 -0.15 -0.05 -0.16
By inspecting the acf plot of residuals, I can say that the residuals are not white.
PACF lags at lag=1 and ACF tails off from Part c).
Let’s try AR(1) model.
reg2 = sarima(logcpg,1,0,0,xreg= time(cpg))
## initial value -0.669056
## iter 2 value -0.999488
## iter 3 value -1.088763
## iter 4 value -1.102248
## iter 5 value -1.128914
## iter 6 value -1.131945
## iter 7 value -1.132479
## iter 8 value -1.132525
## iter 9 value -1.132540
## iter 10 value -1.132543
## iter 11 value -1.132545
## iter 12 value -1.132545
## iter 12 value -1.132545
## iter 12 value -1.132545
## final value -1.132545
## converged
## initial value -0.701381
## iter 2 value -0.882862
## iter 3 value -0.886699
## iter 4 value -0.888651
## iter 5 value -0.888966
## iter 6 value -0.889035
## iter 7 value -0.889043
## iter 8 value -0.889045
## iter 9 value -0.889045
## iter 10 value -0.889045
## iter 10 value -0.889045
## iter 10 value -0.889045
## final value -0.889045
## converged
After inspecting the residuals and ACF it looks fairly white.